Overview

Dataset statistics

Number of variables: 16
Number of observations: 264960
Missing cells: 248517
Missing cells (%): 5.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.3 MiB
Average record size in memory: 128.0 B
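
The derived figures in this block can be re-checked from the raw counts. The sketch below does that arithmetic; the 8-bytes-per-cell figure is an assumption (a float64 value or an object pointer per cell), chosen because it reproduces the reported record size.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 16
n_observations = 264_960
missing_cells = 248_517

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: every cell occupies 8 bytes (float64 value or object pointer).
BYTES_PER_CELL = 8
record_bytes = n_variables * BYTES_PER_CELL
total_mib = n_observations * record_bytes / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {record_bytes:.1f} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption the three printed values match the report: 5.9%, 128.0 B and 32.3 MiB.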

Variable types

NUM: 11
CAT: 5

Reproduction

Analysis started: 2020-11-16 03:25:57.343899
Analysis finished: 2020-11-16 03:27:02.412987
Duration: 1 minute and 5.07 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
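
The same kind of report can also be produced from Python rather than the command line. This is a minimal sketch, assuming pandas and pandas-profiling v2.8.0 are installed; `build_report`, `housing.csv` and `report.html` are hypothetical names, and reading `Zipcode` as text is an assumption made so that values such as 02646 keep their leading zero. It is not the script that generated this report.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Assumption: Zipcode is read as a string so leading zeros survive.
    df = pd.read_csv(csv_path, dtype={"Zipcode": str})
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
    return report


# Hypothetical usage; replace the file names with real paths:
# build_report("housing.csv", "report.html")
```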

Warnings

Zipcode has a high cardinality: 33120 distinct values High cardinality
State has a high cardinality: 51 distinct values High cardinality
City has a high cardinality: 14740 distinct values High cardinality
Metro has a high cardinality: 861 distinct values High cardinality
CountyName has a high cardinality: 1759 distinct values High cardinality
med_hIncome is highly correlated with Year and 3 other fields High correlation
Year is highly correlated with med_hIncome and 3 other fields High correlation
uspop_growth is highly correlated with int_rate High correlation
int_rate is highly correlated with uspop_growth High correlation
unemplt_rate is highly correlated with Year and 3 other fields High correlation
newHouse_starts is highly correlated with Year and 3 other fields High correlation
resConstruct_spending is highly correlated with Year and 3 other fields High correlation
RentPrice has 19907 (7.5%) missing values Missing
SizeRank has 27056 (10.2%) missing values Missing
State has 27056 (10.2%) missing values Missing
City has 27056 (10.2%) missing values Missing
Metro has 83008 (31.3%) missing values Missing
CountyName has 27056 (10.2%) missing values Missing
HomePrice has 37378 (14.1%) missing values Missing
Zipcode is uniformly distributed Uniform
Vacancy_Rate% has 14908 (5.6%) zeros Zeros
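
The percentages in the Missing and Zeros warnings follow directly from the counts and the 264960 observations. The sketch below recomputes them from the numbers listed in the warnings; it makes no claim about the thresholds the profiler uses to decide when a warning is raised.

```python
n_observations = 264_960

# Missing-value counts as listed in the warnings.
missing_counts = {
    "RentPrice": 19_907,
    "SizeRank": 27_056,
    "State": 27_056,
    "City": 27_056,
    "Metro": 83_008,
    "CountyName": 27_056,
    "HomePrice": 37_378,
}

missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_counts.items()
}
total_missing = sum(missing_counts.values())
zeros_pct = round(100 * 14_908 / n_observations, 1)  # Vacancy_Rate% zeros

for name, pct in missing_pct.items():
    print(f"{name}: {pct}% missing")
print(f"Total missing cells: {total_missing}")
print(f"Vacancy_Rate% zeros: {zeros_pct}%")
```

The per-column counts add up to the 248517 missing cells reported under Dataset statistics.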

Variables

Zipcode
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 33120
Unique (%): 12.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Top values: 17021 (8), 73662 (8), 95616 (8), 02646 (8), 00983 (8); other values (33115): 264920

Value | Count | Frequency (%)
17021 | 8 | < 0.1%
73662 | 8 | < 0.1%
95616 | 8 | < 0.1%
02646 | 8 | < 0.1%
00983 | 8 | < 0.1%
95060 | 8 | < 0.1%
79088 | 8 | < 0.1%
65715 | 8 | < 0.1%
68928 | 8 | < 0.1%
30464 | 8 | < 0.1%
Other values (33110) | 264880 | > 99.9%

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

RentPrice
Real number (ℝ≥0)

MISSING

Distinct count: 153859
Unique (%): 62.8%
Missing: 19907
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1067.691442092119
Minimum: 19.960000000000036
Maximum: 5620.320000000002
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 19.96
5-th percentile: 581.736
Q1: 781.625
Median: 942.446
Q3: 1204.816
95-th percentile: 1960.804
Maximum: 5620.32
Range: 5600.36
Interquartile range (IQR): 423.191

Descriptive statistics

Standard deviation: 491.6269767
Coefficient of variation (CV): 0.4604579163
Kurtosis: 13.77100961
Mean: 1067.691442
Median Absolute Deviation (MAD): 195.09
Skewness: 2.826552991
Sum: 261640991
Variance: 241697.0842
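
Several RentPrice statistics are simple functions of the others. A small sketch, using only the figures reported for RentPrice, shows how Range, IQR, CV, Variance and Sum can be re-derived; the variable names are illustrative.

```python
# Figures copied from the RentPrice statistics.
n_observations = 264_960
n_missing = 19_907
minimum, maximum = 19.96, 5620.32
q1, q3 = 781.625, 1204.816
mean, std = 1067.691442, 491.6269767

value_range = maximum - minimum               # reported Range: 5600.36
iqr = q3 - q1                                 # reported IQR: 423.191
cv = std / mean                               # reported CV: 0.4604579163
variance = std ** 2                           # reported Variance: 241697.0842
total = mean * (n_observations - n_missing)   # reported Sum: 261640991

for label, value in [
    ("Range", value_range),
    ("IQR", iqr),
    ("CV", cv),
    ("Variance", variance),
    ("Sum", total),
]:
    print(f"{label}: {value:.4f}")
```

Note that the sum is taken over the non-missing observations only, which is why the missing count is subtracted first.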
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
681.736 | 600 | 0.2%
731.736 | 552 | 0.2%
631.736 | 536 | 0.2%
1281.736 | 528 | 0.2%
831.736 | 503 | 0.2%
581.736 | 491 | 0.2%
781.736 | 490 | 0.2%
1006.736 | 487 | 0.2%
881.736 | 366 | 0.1%
1106.736 | 340 | 0.1%
Other values (153849) | 240160 | 90.6%
(Missing) | 19907 | 7.5%

Value | Count | Frequency (%)
19.96 | 7 | < 0.1%
94.96 | 8 | < 0.1%
103.29 | 1 | < 0.1%
133.95 | 1 | < 0.1%
139.4 | 1 | < 0.1%

Value | Count | Frequency (%)
5620.32 | 5 | < 0.1%
5619.7952 (value and count fused; split unclear) | | < 0.1%
5616.46 | 3 | < 0.1%
5563.03 | 2 | < 0.1%
5558.206125 (value and count fused; split unclear) | | < 0.1%

Year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.5
Minimum: 2011
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 2011
5-th percentile: 2011
Q1: 2012.75
Median: 2014.5
Q3: 2016.25
95-th percentile: 2018
Maximum: 2018
Range: 7
Interquartile range (IQR): 3.5

Descriptive statistics

Standard deviation: 2.291292171
Coefficient of variation (CV): 0.001137399936
Kurtosis: -1.238095957
Mean: 2014.5
Median Absolute Deviation (MAD): 2
Skewness: 0
Sum: 533761920
Variance: 5.250019814
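
Year takes each of its 8 values exactly 33120 times, so its statistics can be worked out by hand. The sketch below assumes the reported variance uses the n - 1 (sample) denominator, which is the pandas default; under that assumption it reproduces the reported variance and standard deviation.

```python
years = list(range(2011, 2019))   # 2011 .. 2018
rows_per_year = 33_120            # count reported for every year

n = len(years) * rows_per_year    # total observations
mean = sum(years) / len(years)

# Population variance of the 8 equally frequent values.
population_variance = sum((y - mean) ** 2 for y in years) / len(years)

# Assumption: the report uses the sample (n - 1) denominator.
sample_variance = population_variance * n / (n - 1)
sample_std = sample_variance ** 0.5

print(n, mean, population_variance)
print(f"{sample_variance:.9f} {sample_std:.9f}")
```

The tiny gap between 5.25 and the reported 5.250019814 is exactly the n / (n - 1) correction.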
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2018 | 33120 | 12.5%
2017 | 33120 | 12.5%
2016 | 33120 | 12.5%
2015 | 33120 | 12.5%
2014 | 33120 | 12.5%
2013 | 33120 | 12.5%
2012 | 33120 | 12.5%
2011 | 33120 | 12.5%

Value | Count | Frequency (%)
2011 | 33120 | 12.5%
2012 | 33120 | 12.5%
2013 | 33120 | 12.5%
2014 | 33120 | 12.5%
2015 | 33120 | 12.5%

Value | Count | Frequency (%)
2018 | 33120 | 12.5%
2017 | 33120 | 12.5%
2016 | 33120 | 12.5%
2015 | 33120 | 12.5%
2014 | 33120 | 12.5%

SizeRank
Real number (ℝ≥0)

MISSING

Distinct count: 11054
Unique (%): 4.6%
Missing: 27056
Missing (%): 10.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 15646.706806106666
Minimum: 0.0
Maximum: 34430.0
Zeros: 8
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1503
Q1: 7531
Median: 15164
Q3: 23514
95-th percentile: 31180
Maximum: 34430
Range: 34430
Interquartile range (IQR): 15983

Descriptive statistics

Standard deviation: 9424.124602
Coefficient of variation (CV): 0.6023072279
Kurtosis: -1.122786021
Mean: 15646.70681
Median Absolute Deviation (MAD): 7971.5
Skewness: 0.1321156224
Sum: 3722414136
Variance: 88814124.52

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
29964 | 288 | 0.1%
30545 | 280 | 0.1%
32062 | 280 | 0.1%
28097 | 248 | 0.1%
29685 | 248 | 0.1%
30892 | 248 | 0.1%
28904 | 248 | 0.1%
29439 | 240 | 0.1%
30504 | 232 | 0.1%
28784 | 232 | 0.1%
Other values (11044) | 235360 | 88.8%
(Missing) | 27056 | 10.2%

Value | Count | Frequency (%)
0 | 8 | < 0.1%
1 | 8 | < 0.1%
2 | 8 | < 0.1%
3 | 8 | < 0.1%
4 | 8 | < 0.1%

Value | Count | Frequency (%)
34430 | 184 | 0.1%
34322 | 128 | < 0.1%
34302 | 24 | < 0.1%
34272 | 8 | < 0.1%
34258 | 8 | < 0.1%

State
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 51
Unique (%): < 0.1%
Missing: 27056
Missing (%): 10.2%
Memory size: 2.0 MiB

Top values: TX (14080), NY (13584), CA (13304), PA (13048), IL (10152); other values (46): 173736

Value | Count | Frequency (%)
TX | 14080 | 5.3%
NY | 13584 | 5.1%
CA | 13304 | 5.0%
PA | 13048 | 4.9%
IL | 10152 | 3.8%
OH | 9280 | 3.5%
FL | 7544 | 2.8%
MI | 7504 | 2.8%
MO | 7432 | 2.8%
IA | 7384 | 2.8%
Other values (41) | 134592 | 50.8%
(Missing) | 27056 | 10.2%

Length

Max length: 3
Median length: 2
Mean length: 2.102113527
Min length: 2

City
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 14740
Unique (%): 6.2%
Missing: 27056
Missing (%): 10.2%
Memory size: 2.0 MiB

Top values: New York (1368), Houston (856), Los Angeles (800), San Antonio (448), Chicago (440); other values (14735): 233992

Value | Count | Frequency (%)
New York | 1368 | 0.5%
Houston | 856 | 0.3%
Los Angeles | 800 | 0.3%
San Antonio | 448 | 0.2%
Chicago | 440 | 0.2%
Springfield | 432 | 0.2%
Dallas | 416 | 0.2%
Columbus | 408 | 0.2%
Kansas City | 400 | 0.2%
Philadelphia | 392 | 0.1%
Other values (14730) | 231944 | 87.5%
(Missing) | 27056 | 10.2%

Length

Max length: 30
Median length: 8
Mean length: 8.448641304
Min length: 3

Metro
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 861
Unique (%): 0.5%
Missing: 83008
Missing (%): 31.3%
Memory size: 2.0 MiB

Top values: New York-Newark-Jersey City (7416), Chicago-Naperville-Elgin (3056), Los Angeles-Long Beach-Anaheim (2896), Philadelphia-Camden-Wilmington (2832), Washington-Arlington-Alexandria (2552); other values (856): 163200

Value | Count | Frequency (%)
New York-Newark-Jersey City | 7416 | 2.8%
Chicago-Naperville-Elgin | 3056 | 1.2%
Los Angeles-Long Beach-Anaheim | 2896 | 1.1%
Philadelphia-Camden-Wilmington | 2832 | 1.1%
Washington-Arlington-Alexandria | 2552 | 1.0%
Pittsburgh | 2544 | 1.0%
Boston-Cambridge-Newton | 2208 | 0.8%
Dallas-Fort Worth-Arlington | 2112 | 0.8%
Houston-The Woodlands-Sugar Land | 1888 | 0.7%
Minneapolis-St. Paul-Bloomington | 1840 | 0.7%
Other values (851) | 152608 | 57.6%
(Missing) | 83008 | 31.3%

Length

Max length: 42
Median length: 9
Mean length: 12.30697464
Min length: 3

CountyName
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 1759
Unique (%): 0.7%
Missing: 27056
Missing (%): 10.2%
Memory size: 2.0 MiB

Top values: Washington County (2848), Jefferson County (2600), Los Angeles County (2200), Franklin County (2120), Montgomery County (2120); other values (1754): 226016

Value | Count | Frequency (%)
Washington County | 2848 | 1.1%
Jefferson County | 2600 | 1.0%
Los Angeles County | 2200 | 0.8%
Franklin County | 2120 | 0.8%
Montgomery County | 2120 | 0.8%
Jackson County | 1864 | 0.7%
Orange County | 1760 | 0.7%
Marion County | 1456 | 0.5%
Wayne County | 1408 | 0.5%
Monroe County | 1408 | 0.5%
Other values (1749) | 218120 | 82.3%
(Missing) | 27056 | 10.2%

Length

Max length: 29
Median length: 14
Mean length: 13.07388285
Min length: 3

HomePrice
Real number (ℝ≥0)

MISSING

Distinct count: 220057
Unique (%): 96.7%
Missing: 37378
Missing (%): 14.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 184668.82838348375
Minimum: 10421.83
Maximum: 6141945.92
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 10421.83
5-th percentile: 50137.775
Q1: 87356.0825
Median: 134018.5
Q3: 214174.6025
95-th percentile: 481902.003
Maximum: 6141945.92
Range: 6131524.09
Interquartile range (IQR): 126818.52

Descriptive statistics

Standard deviation: 185822.1927
Coefficient of variation (CV): 1.006245582
Kurtosis: 65.51499059
Mean: 184668.8284
Median Absolute Deviation (MAD): 55750.125
Skewness: 5.668235713
Sum: 4.20273013e+10
Variance: 3.452988731e+10

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
75169.83 | 4 | < 0.1%
54673.67 | 4 | < 0.1%
110771.67 | 4 | < 0.1%
88629 | 4 | < 0.1%
81272 | 4 | < 0.1%
57537.17 | 4 | < 0.1%
140845.33 | 3 | < 0.1%
112880.33 | 3 | < 0.1%
64628.33 | 3 | < 0.1%
73042.33 | 3 | < 0.1%
Other values (220047) | 227546 | 85.9%
(Missing) | 37378 | 14.1%

Value | Count | Frequency (%)
10421.83 | 1 | < 0.1%
10956.33 | 1 | < 0.1%
11688 | 1 | < 0.1%
11860.83 | 1 | < 0.1%
12041.42 | 1 | < 0.1%

Value | Count | Frequency (%)
6141945.92 | 1 | < 0.1%
5373670.92 | 1 | < 0.1%
5197037.17 | 1 | < 0.1%
4928414.67 | 1 | < 0.1%
4771183.92 | 1 | < 0.1%

Vacancy_Rate%
Real number (ℝ≥0)

ZEROS

Distinct count: 175367
Unique (%): 66.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.675088463689917
Minimum: 0.0
Maximum: 100.0
Zeros: 14908
Zeros (%): 5.6%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6.977963141
Median: 12.79997659
Q3: 22.58064516
95-th percentile: 52.63157895
Maximum: 100
Range: 100
Interquartile range (IQR): 15.60268202

Descriptive statistics

Standard deviation: 16.4379872
Coefficient of variation (CV): 0.9300087651
Kurtosis: 4.626284265
Mean: 17.67508846
Median Absolute Deviation (MAD): 6.940601589
Skewness: 1.963426228
Sum: 4683191.439
Variance: 270.207423

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 14908 | 5.6%
100 | 712 | 0.3%
20 | 313 | 0.1%
25 | 286 | 0.1%
33.33333333 | 266 | 0.1%
16.66666667 | 263 | 0.1%
14.28571429 | 218 | 0.1%
50 | 198 | 0.1%
12.5 | 190 | 0.1%
11.11111111 | 176 | 0.1%
Other values (175357) | 247430 | 93.4%

Value | Count | Frequency (%)
0 | 14908 | 5.6%
0.02272727273 | 1 | < 0.1%
0.1114827202 | 1 | < 0.1%
0.1248439451 | 1 | < 0.1%
0.1402524544 | 1 | < 0.1%

Value | Count | Frequency (%)
100 | 712 | 0.3%
99.83974359 | 1 | < 0.1%
99.71791255 | 1 | < 0.1%
99.65337955 | 1 | < 0.1%
99.57386364 | 1 | < 0.1%

int_rate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.109375
Minimum: 0.75
Maximum: 2.458333333333333
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0.75
5-th percentile: 0.75
Q1: 0.75
Median: 0.7604166667
Q3: 1.171875
95-th percentile: 2.458333333
Maximum: 2.458333333
Range: 1.708333333
Interquartile range (IQR): 0.421875

Descriptive statistics

Standard deviation: 0.5835901449
Coefficient of variation (CV): 0.5260530884
Kurtosis: 0.7307521433
Mean: 1.109375
Median Absolute Deviation (MAD): 0.01041666667
Skewness: 1.488402739
Sum: 293940
Variance: 0.3405774573

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0.75 | 132480 | 50.0%
1.020833333 | 33120 | 12.5%
2.458333333 | 33120 | 12.5%
1.625 | 33120 | 12.5%
0.7708333333 | 33120 | 12.5%

Value | Count | Frequency (%)
0.75 | 132480 | 50.0%
0.7708333333 | 33120 | 12.5%
1.020833333 | 33120 | 12.5%
1.625 | 33120 | 12.5%
2.458333333 | 33120 | 12.5%

Value | Count | Frequency (%)
2.458333333 | 33120 | 12.5%
1.625 | 33120 | 12.5%
1.020833333 | 33120 | 12.5%
0.7708333333 | 33120 | 12.5%
0.75 | 132480 | 50.0%
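
Because int_rate has only five distinct values, its mean and sum can be recovered from the frequency table alone. The sketch below computes the count-weighted mean from the (value, count) pairs listed for int_rate; the values are the rounded ones shown in the table, so the result agrees with the report only up to rounding.

```python
# (value, count) pairs from the int_rate frequency table.
frequencies = [
    (0.75, 132_480),
    (0.7708333333, 33_120),
    (1.020833333, 33_120),
    (1.625, 33_120),
    (2.458333333, 33_120),
]

n = sum(count for _, count in frequencies)
total = sum(value * count for value, count in frequencies)
weighted_mean = total / n

print(f"n = {n}")
print(f"Sum = {total:.2f}")
print(f"Mean = {weighted_mean:.6f}")
```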
 

med_hIncome
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60351.0
Minimum: 56912.0
Maximum: 64324.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 56912
5-th percentile: 56912
Q1: 57756
Median: 59945.5
Q3: 63113.75
95-th percentile: 64324
Maximum: 64324
Range: 7412
Interquartile range (IQR): 5357.75

Descriptive statistics

Standard deviation: 2846.855913
Coefficient of variation (CV): 0.04717164443
Kurtosis: -1.621558691
Mean: 60351
Median Absolute Deviation (MAD): 2938.5
Skewness: 0.1383636684
Sum: 1.599060096e+10
Variance: 8104588.588

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
64324 | 33120 | 12.5%
63761 | 33120 | 12.5%
62898 | 33120 | 12.5%
60987 | 33120 | 12.5%
58904 | 33120 | 12.5%
58001 | 33120 | 12.5%
57021 | 33120 | 12.5%
56912 | 33120 | 12.5%

Value | Count | Frequency (%)
56912 | 33120 | 12.5%
57021 | 33120 | 12.5%
58001 | 33120 | 12.5%
58904 | 33120 | 12.5%
60987 | 33120 | 12.5%

Value | Count | Frequency (%)
64324 | 33120 | 12.5%
63761 | 33120 | 12.5%
62898 | 33120 | 12.5%
60987 | 33120 | 12.5%
58904 | 33120 | 12.5%

uspop_growth
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6827791724977739
Minimum: 0.5223373578996761
Maximum: 0.730641178178307
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0.5223373579
5-th percentile: 0.5223373579
Q1: 0.67283184
Median: 0.718343551
Q3: 0.7273311718
95-th percentile: 0.7306411782
Maximum: 0.7306411782
Range: 0.2083038203
Interquartile range (IQR): 0.05449933187

Descriptive statistics

Standard deviation: 0.06823199467
Coefficient of variation (CV): 0.09993274168
Kurtosis: 0.9576098655
Mean: 0.6827791725
Median Absolute Deviation (MAD): 0.01073588595
Skewness: -1.531094737
Sum: 180909.1695
Variance: 0.004655605096

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0.7306411782 | 33120 | 12.5%
0.7200176887 | 33120 | 12.5%
0.7272689972 | 33120 | 12.5%
0.6310078932 | 33120 | 12.5%
0.7166694134 | 33120 | 12.5%
0.5223373579 | 33120 | 12.5%
0.7275176958 | 33120 | 12.5%
0.6867731556 | 33120 | 12.5%

Value | Count | Frequency (%)
0.5223373579 | 33120 | 12.5%
0.6310078932 | 33120 | 12.5%
0.6867731556 | 33120 | 12.5%
0.7166694134 | 33120 | 12.5%
0.7200176887 | 33120 | 12.5%

Value | Count | Frequency (%)
0.7306411782 | 33120 | 12.5%
0.7275176958 | 33120 | 12.5%
0.7272689972 | 33120 | 12.5%
0.7200176887 | 33120 | 12.5%
0.7166694134 | 33120 | 12.5%

unemplt_rate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.113541666666668
Minimum: 3.8916666666666666
Maximum: 8.933333333333334
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 3.891666667
5-th percentile: 3.891666667
Q1: 4.741666667
Median: 5.716666667
Q3: 7.5375
95-th percentile: 8.933333333
Maximum: 8.933333333
Range: 5.041666667
Interquartile range (IQR): 2.795833333

Descriptive statistics

Standard deviation: 1.719867468
Coefficient of variation (CV): 0.2813209693
Kurtosis: -1.321304391
Mean: 6.113541667
Median Absolute Deviation (MAD): 1.508333333
Skewness: 0.3163534165
Sum: 1619844
Variance: 2.957944106

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
5.275 | 33120 | 12.5%
6.158333333 | 33120 | 12.5%
7.358333333 | 33120 | 12.5%
8.075 | 33120 | 12.5%
3.891666667 | 33120 | 12.5%
8.933333333 | 33120 | 12.5%
4.341666667 | 33120 | 12.5%
4.875 | 33120 | 12.5%

Value | Count | Frequency (%)
3.891666667 | 33120 | 12.5%
4.341666667 | 33120 | 12.5%
4.875 | 33120 | 12.5%
5.275 | 33120 | 12.5%
6.158333333 | 33120 | 12.5%

Value | Count | Frequency (%)
8.933333333 | 33120 | 12.5%
8.075 | 33120 | 12.5%
7.358333333 | 33120 | 12.5%
6.158333333 | 33120 | 12.5%
5.275 | 33120 | 12.5%

newHouse_starts
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1007.8854166666666
Minimum: 611.9166666666666
Maximum: 1248.25
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 611.9166667
5-th percentile: 611.9166667
Q1: 892.0625
Median: 1053.5
Q3: 1184.291667
95-th percentile: 1248.25
Maximum: 1248.25
Range: 636.3333333
Interquartile range (IQR): 292.2291667

Descriptive statistics

Standard deviation: 208.9448726
Coefficient of variation (CV): 0.2073101458
Kurtosis: -0.8373471732
Mean: 1007.885417
Median Absolute Deviation (MAD): 139.625
Skewness: -0.6338115257
Sum: 267049320
Variance: 43657.9598

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1000.25 | 33120 | 12.5%
1176.583333 | 33120 | 12.5%
1207.416667 | 33120 | 12.5%
611.9166667 | 33120 | 12.5%
783.75 | 33120 | 12.5%
1248.25 | 33120 | 12.5%
928.1666667 | 33120 | 12.5%
1106.75 | 33120 | 12.5%

Value | Count | Frequency (%)
611.9166667 | 33120 | 12.5%
783.75 | 33120 | 12.5%
928.1666667 | 33120 | 12.5%
1000.25 | 33120 | 12.5%
1106.75 | 33120 | 12.5%

Value | Count | Frequency (%)
1248.25 | 33120 | 12.5%
1207.416667 | 33120 | 12.5%
1176.583333 | 33120 | 12.5%
1106.75 | 33120 | 12.5%
1000.25 | 33120 | 12.5%

resConstruct_spending
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 410836.1979166665
Minimum: 255208.58333333328
Maximum: 564448.75
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 255208.5833
5-th percentile: 255208.5833
Q1: 321154.3958
Median: 410493.3333
Q3: 500871.9167
95-th percentile: 564448.75
Maximum: 564448.75
Range: 309240.1667
Interquartile range (IQR): 179717.5208

Descriptive statistics

Standard deviation: 109740.0191
Coefficient of variation (CV): 0.2671138026
Kurtosis: -1.409802147
Mean: 410836.1979
Median Absolute Deviation (MAD): 103413.4583
Skewness: 0.002059248536
Sum: 1.08855159e+11
Variance: 1.204287178e+10

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
546020.1667 | 33120 | 12.5%
278995.5833 | 33120 | 12.5%
485822.5 | 33120 | 12.5%
335207.3333 | 33120 | 12.5%
382868.3333 | 33120 | 12.5%
438118.3333 | 33120 | 12.5%
564448.75 | 33120 | 12.5%
255208.5833 | 33120 | 12.5%

Value | Count | Frequency (%)
255208.5833 | 33120 | 12.5%
278995.5833 | 33120 | 12.5%
335207.3333 | 33120 | 12.5%
382868.3333 | 33120 | 12.5%
438118.3333 | 33120 | 12.5%

Value | Count | Frequency (%)
564448.75 | 33120 | 12.5%
546020.1667 | 33120 | 12.5%
485822.5 | 33120 | 12.5%
438118.3333 | 33120 | 12.5%
382868.3333 | 33120 | 12.5%

Interactions

[121 interaction plots, not reproduced here]
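
The Interactions section holds 121 figures, which equals 11 × 11. A plausible reading, stated here as an assumption rather than something the report confirms, is one plot per ordered pair of the 11 numeric variables. The sketch below enumerates those pairs.

```python
from itertools import product

# The 11 variables the report types as "Real number".
numeric_variables = [
    "RentPrice", "Year", "SizeRank", "HomePrice", "Vacancy_Rate%",
    "int_rate", "med_hIncome", "uspop_growth", "unemplt_rate",
    "newHouse_starts", "resConstruct_spending",
]

# Assumption: one plot per ordered (x, y) pair, including x == y.
pairs = list(product(numeric_variables, repeat=2))

print(len(numeric_variables), len(pairs))
```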

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
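
The definition of r translates directly into code. This is a minimal pure-Python sketch on toy data (the function name and the sample values are illustrative, not taken from the dataset); it also demonstrates the invariance under changes in location and scale.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the variances cancel, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(xs, ys)
# Shifting and rescaling either variable leaves r unchanged.
r_transformed = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
print(round(r, 6), round(r_transformed, 6))
```

The function divides by zero if either variable is constant; a production version would need to handle that case.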

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
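
Since ρ is Pearson's r applied to ranks, a sketch needs only a ranking step on top of the Pearson formula. The code below is a pure-Python illustration on toy data; giving tied values the average of their positions is a common convention and is assumed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [math.exp(x) for x in xs]  # monotonic but strongly nonlinear
print(round(spearman_rho(xs, ys), 6), round(pearson_r(xs, ys), 6))
```

On this example ρ is 1 because the relationship is perfectly monotonic, while Pearson's r falls below 1 because it is not linear.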

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
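
The pair-counting definition can be sketched in a few lines. The version below implements exactly the formula described here, often called tau-a; library implementations typically use the tau-b variant, which adjusts the denominator for ties, so results can differ on tied data. The function name and sample values are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie; it counts as neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This brute-force loop visits every pair, so it is quadratic in the number of observations and only suitable for small samples.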

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[4 missing-value plots, not reproduced here]

Sample

First rows

(index) | Zipcode | RentPrice | Year | SizeRank | State | City | Metro | CountyName | HomePrice | Vacancy_Rate% | int_rate | med_hIncome | uspop_growth | unemplt_rate | newHouse_starts | resConstruct_spending
0 | 02333 | 1368.536 | 2011 | 8782.0 | MA | East Bridgewater | Boston-Cambridge-Newton | Plymouth County | NaN | 3.024027 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
1 | 02338 | 1311.076 | 2011 | 11179.0 | MA | Halifax | Boston-Cambridge-Newton | Plymouth County | 274920.17 | 3.116343 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
2 | 02339 | 1484.626 | 2011 | 8621.0 | MA | Hanover | Boston-Cambridge-Newton | Plymouth County | 415097.50 | 4.464646 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
3 | 02341 | 1266.816 | 2011 | 10079.0 | MA | Hanson | Boston-Cambridge-Newton | Plymouth County | NaN | 3.586322 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
4 | 02343 | 1524.006 | 2011 | 9640.0 | MA | Holbrook | Boston-Cambridge-Newton | Norfolk County | 247510.42 | 3.732901 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
5 | 02346 | 1310.016 | 2011 | 5289.0 | MA | Middleborough | Boston-Cambridge-Newton | Plymouth County | 264492.50 | 7.960256 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
6 | 02347 | 1307.736 | 2011 | 9579.0 | MA | Lakeville | Boston-Cambridge-Newton | Plymouth County | 309743.67 | 11.565968 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
7 | 02351 | 1399.926 | 2011 | 7293.0 | MA | Abington | Boston-Cambridge-Newton | Plymouth County | 279614.92 | 5.455122 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
8 | 02356 | 1753.956 | 2011 | 9084.0 | MA | Easton | Providence-Warwick | Bristol County | 371979.42 | 2.849920 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333
9 | 02357 | 581.736 | 2011 | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000 | 0.75 | 57021.0 | 0.720018 | 8.933333 | 611.916667 | 255208.583333

Last rows

(index) | Zipcode | RentPrice | Year | SizeRank | State | City | Metro | CountyName | HomePrice | Vacancy_Rate% | int_rate | med_hIncome | uspop_growth | unemplt_rate | newHouse_starts | resConstruct_spending
264950 | 98134 | 1909.58 | 2018 | 28159.0 | WA | Seattle | Seattle-Tacoma-Bellevue | King County | 438970.00 | 13.580247 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264951 | 98174 | NaN | 2018 | NaN | NaN | NaN | NaN | NaN | NaN | 0.000000 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264952 | 98222 | NaN | 2018 | 30981.0 | WA | Olga | NaN | San Juan County | 580646.58 | 83.471074 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264953 | 98233 | 1413.89 | 2018 | 7640.0 | WA | Burlington | Mount Vernon-Anacortes | Skagit County | 317426.75 | 4.853765 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264954 | 98243 | 1302.94 | 2018 | NaN | NaN | NaN | NaN | NaN | NaN | 57.293233 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264955 | 98279 | 1059.87 | 2018 | 23400.0 | WA | Olga | NaN | San Juan County | 552805.42 | 51.219512 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264956 | 98280 | 993.85 | 2018 | 25265.0 | WA | Eastsound | NaN | San Juan County | 678499.00 | 51.329243 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264957 | 98311 | 1533.50 | 2018 | 4981.0 | WA | Bremerton | Bremerton-Silverdale | Kitsap County | 314320.83 | 6.540162 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264958 | 98326 | 778.99 | 2018 | 26185.0 | WA | Clallam Bay | Port Angeles | Clallam County | 150193.17 | 28.537736 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75
264959 | 98332 | 1840.86 | 2018 | 6759.0 | WA | Gig Harbor | Seattle-Tacoma-Bellevue | Pierce County | 535136.75 | 7.340077 | 2.458333 | 64324.0 | 0.522337 | 3.891667 | 1248.25 | 564448.75